Yiming Yang

Kuaishou Technology

Unsupervised Diffusion Solver for Combinatorial Optimization via Combinatorial Adjoint Matching

May 29, 2026
Via arXiv

Scaling World-Model Reinforcement Learning Through Diffusion Policy Optimization

May 25, 2026
Via arXiv

Reinforcing Human Behavior Simulation via Verbal Feedback

May 19, 2026
Via arXiv

Spend Less, Fit Better: Budget-Efficient Scaling Law Fitting via Active Experiment Selection

Apr 24, 2026
Via arXiv

Spatiotemporal System Forecasting with Irregular Time Steps via Masked Autoencoder

Mar 26, 2026
Via arXiv

Mind the Sim2Real Gap in User Simulation for Agentic Tasks

Mar 11, 2026
Via arXiv

Learn Hard Problems During RL with Reference Guided Fine-tuning

Mar 05, 2026
Via arXiv

GradAlign: Gradient-Aligned Data Selection for LLM Reinforcement Learning

Feb 25, 2026
Via arXiv

Enroll-on-Wakeup: A First Comparative Study of Target Speech Extraction for Seamless Interaction in Real Noisy Human-Machine Dialogue Scenarios

Feb 17, 2026
Via arXiv

PRISM: A Principled Framework for Multi-Agent Reasoning via Gain Decomposition

Feb 09, 2026
Via arXiv